35 research outputs found

    The impact of sequence database choice on metaproteomic results in gut microbiota studies

    Background: Elucidating the role of gut microbiota in physiological and pathological processes has recently emerged as a key research aim in life sciences. In this respect, metaproteomics, the study of the whole protein complement of a microbial community, can provide a unique contribution by revealing which functions are actually being expressed by specific microbial taxa. However, its wide application to gut microbiota research has been hindered by challenges in data analysis, especially related to the choice of the proper sequence databases for protein identification. Results: Here, we present a systematic investigation of variables concerning database construction and annotation and evaluate their impact on human and mouse gut metaproteomic results. We found that both publicly available and experimental metagenomic databases lead to the identification of unique peptide assortments, suggesting parallel database searches as a means to gain more complete information. In particular, the contribution of experimental metagenomic databases was revealed to be mandatory when dealing with mouse samples. Moreover, the use of a "merged" database, containing all metagenomic sequences from the population under study, was found to be generally preferable over the use of sample-matched databases. We also observed that taxonomic and functional results are strongly database-dependent, in particular when analyzing the mouse gut microbiota. As a striking example, the Firmicutes/Bacteroidetes ratio varied up to tenfold depending on the database used. Finally, assembling reads into longer contigs provided significant advantages in terms of functional annotation yields. Conclusions: This study contributes to identifying host- and database-specific biases which need to be taken into account in a metaproteomic experiment, providing meaningful insights on how to design gut microbiota studies and to perform metaproteomic data analysis. In particular, the use of multiple databases and annotation tools has to be encouraged, even though this requires appropriate bioinformatic resources.

    High throughput genomic and proteomic technologies in the fight against infectious diseases

    New technologies have shown significant promise in the fight against infectious diseases, with the discovery of novel molecular targets for in vitro diagnostics and the improved design of vaccines. In developing countries, especially in areas of neglected diseases and resource-poor settings, a number of technological innovations are further needed, such as the integration of old and new biomarkers in suitable analysis platforms, the simplification of existing analysis systems, and the improvement of sample preservation and management. However, in these areas, identification of new biomarkers for infectious diseases is still a core issue in the diagnostic quest. Similarly, new technologies will allow scientists to design vaccines with improved immunogenicity, efficacy and safety in the local area, according to the circulating pathogenic strains and the genetic background of the population to be immunized. In this work we review the current omics-based technologies and their potential for accelerating the development of next generation vaccines and the identification of biomarkers suitable for point-of-care (POC) diagnostic applications.

    Genomic analysis of Sardinian 26544/OG10 isolate of African swine fever virus

    Comparative genomic analysis aims to underscore genetic assortment diversification in distinct viral isolates, to identify deletions and to carry out evolutionary studies. We sequenced the first complete genome of an ASFV p72 genotype I strain isolated from domestic pigs in Sardinia (Italy) using next-generation sequencing (NGS) technology. The genome is 182,906 bp long, contains 164 ORFs and has a 99.20% nucleotide identity to the L60 strain. Comparison analysis against the 16 ASFV genomes available in the database showed that 136 ORFs are present in nine ASFV isolates annotated to date. The most divergent ORFs code for uncharacterized proteins such as X69R and DP96R, which have 51.3% and 70.4% nucleotide identity to the other isolates. A comparison between the Sardinian isolate and the avirulent isolates OURT 88/3, NHV, BA71V was also carried out. Major variations were found within the multigene families (MGFs) located in the left and right genome regions.

    Evaluating the impact of different sequence databases on metaproteome analysis: insights from a lab-assembled microbial mixture

    Metaproteomics enables the investigation of the protein repertoire expressed by complex microbial communities. However, to unleash its full potential, refinements in bioinformatic approaches for data analysis are still needed. In this context, sequence databases selection represents a major challenge. This work assessed the impact of different databases in metaproteomic investigations by using a mock microbial mixture including nine diverse bacterial and eukaryotic species, which was subjected to shotgun metaproteomic analysis. Then, both the microbial mixture and the single microorganisms were subjected to next generation sequencing to obtain experimental metagenomic- and genomic-derived databases, which were used along with public databases (namely, NCBI, UniProtKB/SwissProt and UniProtKB/TrEMBL, parsed at different taxonomic levels) to analyze the metaproteomic dataset. First, a quantitative comparison in terms of number and overlap of peptide identifications was carried out among all databases. As a result, only 35% of peptides were common to all database classes; moreover, genus/species-specific databases provided up to 17% more identifications compared to databases with generic taxonomy, while the metagenomic database enabled a slight increment in respect to public databases. Then, database behavior in terms of false discovery rate and peptide degeneracy was critically evaluated. Public databases with generic taxonomy exhibited a markedly different trend compared to the counterparts. Finally, the reliability of taxonomic attribution according to the lowest common ancestor approach (using MEGAN and Unipept software) was assessed. The level of misassignments varied among the different databases, and specific thresholds based on the number of taxon-specific peptides were established to minimize false positives. 
This study confirms that database selection has a significant impact in metaproteomics, and provides critical indications for improving depth and reliability of metaproteomic results. Specifically, the use of iterative searches and of suitable filters for taxonomic assignments is proposed with the aim of increasing coverage and trustworthiness of metaproteomic data.

    Molecular characterization of influenza A(H1N1)pdm09 virus circulating during the 2009 outbreak in Thua Thien Hue, Vietnam

    Introduction: The influenza A(H1N1)pdm09 virus arrived in Vietnam in May 2009 via the United States and rapidly spread throughout the country. This study provides data on the viral diagnosis and molecular epidemiology of influenza A(H1N1)pdm09 virus isolated in Thua Thien Hue Province, central Vietnam. Methodology: Nasopharyngeal swabs and throat swabs from 53 clinically infected patients in the peak of the outbreak were processed for viral diagnosis by culture and RT-PCR. Sequencing of entire HA and NA genes of representative isolates and molecular epidemiological analysis were performed. Results: A total of 32 patients were positive for influenza A virus by virus culture and/or RT-PCR; of these 22 were positive both by viral isolation and RT-PCR, 2 only by virus culture and 8 only by RT-PCR. The novel subtype of influenza A(H1N1)pdm09 was present in 93.4% of the isolates. Phylogenetic analysis of the HA and NA gene sequences showed identities higher than 99.50% in both genes. They were also similar to reference isolates in HA sequences (>99% identity) and in NA sequences (>98.50% identity). Amino acid sequences predicted for the HA gene were highly identical to reference strains. The NA amino acid substitutions identified did not include the oseltamivir-resistant H275Y substitution. Conclusion: Viral isolation and RT-PCR together were useful for diagnosis of the influenza A(H1N1)pdm09 virus. Variations in HA and NA sequences are similar to those identified in worldwide reference isolates and no drug resistance was found.

    Metagenomics and microscope revealed T. trichiura and other intestinal parasites in a cesspit of an Italian nineteenth century aristocratic palace

    This study evidenced the presence of parasites in a cesspit of an aristocratic palace of nineteenth century in Sardinia (Italy) by the use of classical paleoparasitological techniques coupled with next-generation sequencing. Parasite eggs identified by microscopy included helminth genera pathogenic for humans and animals: the whipworm Trichuris sp., the roundworm Ascaris sp., the flatworm Dicrocoelium sp. and the fish tapeworm Diphyllobothrium sp. In addition, 18S rRNA metabarcoding and metagenomic sequencing analysis allowed the first description in Sardinia of aDNA of the human specific T. trichiura species and Ascaris genus. Their presence is important for understanding the health conditions, hygiene habits, agricultural practices and the diet of the local inhabitants in the period under study

    International Coordination of Long-Term Ocean Biology Time Series Derived from Satellite Ocean Color Data

    In this paper, we will describe plans to coordinate the initial development of long-term ocean biology time series derived from global ocean color observations acquired by the United States, Japan and Europe. Specifically, we have been commissioned by the International Ocean Color Coordinating Group (IOCCG) to coordinate the development of merged products derived from the OCTS, SeaWiFS, MODIS, MERIS and GLI imagers. Each of these missions will have been launched by the year 2002 and will have produced global ocean color data products. Our goal is to develop and document the procedures to be used by each space agency (NASA, NASDA, and ESA) to merge chlorophyll, primary productivity, and other products from these missions. This coordination is required to initiate the production of long-term ocean biology time series which will be continued operationally beyond 2002. The purpose of the time series is to monitor interannual to decadal-scale variability in oceanic primary productivity and to study the effects of environmental change on upper ocean biogeochemical processes.